Introduction


Assessing school principal performance is both necessary and challenging. It is 
necessary because principal performance assessments offer districts an additional 
mechanism to ensure accountability for results and reinforce the importance of strong 
leadership practices. After all, school principals are second only to classroom teachers 
as the most influential school factor in student achievement (Hallinger & Heck, 1998; 
Leithwood, Louis, Anderson, & Wahlstrom, 2004). Principal performance assessments 
also give central office administrators and principals themselves information
with which to build professional learning plans and chart professional growth. Such
assessments are also challenging because principals’ practice and influence on
instruction are sometimes not readily apparent.

During the past five years, many states have begun using validated measures in 
summative assessments of novice principal competency as a basis for certification 
decisions. These measures may be psychometrically sound but often cannot be used 
for formative performance assessments or professional development planning (Reeves, 
2005). To be used as a formative performance assessment, test results would have to 
be disaggregated, and their underlying constructs would need to be made transparent to 
readers. In addition, administrative and analytic control would have to be transferred to 
local educators (see “Formative Versus Summative Assessment: What Is the Difference?”
on page 2). 

Although standardized tests are used for certification purposes, school districts use
other types of assessments, independent of those summative standardized measures,
to judge principal performance formatively and to plan professional learning.
Scanning the field, Goldring et al. (2009) found that
school districts often use idiosyncratic and inconsistent measures for principal 
performance assessment. Districts’ principal performance assessments may or may 
not be aligned with existing professional standards, and they often lack justification 
or documentation of psychometric rigor (Heck & Marcoulides, 1996). In other words, 
district performance assessments allow for formative feedback, but the measures vary 
in quality and rigor. This variance opens up the possibility that scores are inaccurate 
or performance assessments do not reflect research-based standards of the field. 

Superintendents and others who seek to improve principal performance assessment
may select one or more of the measures reviewed in this brief or may develop and
validate their own
measures. Regardless of origin, assessments should be validated and reliable to 
ensure accuracy and applicability to principal performance. 

This brief reports results of a scan of publicly
available measures conducted by Learning Point 
Associates staff* in 2009. The measures included 
in this review are expressly intended to evaluate 
principal performance and have varying degrees 
of publicly available evidence of psychometric 
testing. The review of this information is intended 
to inform decision makers’ selection of job 
performance instruments used for hiring, 
performance assessment, and tenure decisions. 
This brief also addresses the importance of 
standards-based measures, the need for 
establishing reliability and validity, and the 
measures that are more widely accepted and 
psychometrically sound. 

New Standards for Principal Performance

Knowledge about what strong principals do to develop and maintain teaching and learning 
excellence has evolved with the changes in the context of schooling and improved school 
leadership research. School principals are being asked to ensure that all students have 
access to high-quality instruction and all educators are held accountable for student 
learning. These tasks deepen and broaden principals’ professional responsibilities 
beyond their traditional roles as building managers. 

New standards for principal performance have emerged and reflect new emphases in 
the profession. The Educational Leadership Policy Standards: ISLLC 2008, for example,
are a widely recognized and frequently referenced set of standards for principals
(Council of Chief State School Officers, 2008). The ISLLC Standards contain six
domains for principal
professional practice: 

■ Setting a widely shared vision for learning 

■ Developing a school culture and instructional program conducive to student learning 
and staff professional growth 

■ Ensuring effective management of the organization, operation, and resources for a 
safe, efficient, and effective learning environment 

■ Collaborating with faculty and community members, responding to diverse 
community interests and needs, and mobilizing community resources 

■ Acting with integrity, fairness, and in an ethical manner

■ Understanding, responding to, and influencing the political, social, legal,
and cultural context

Formative Versus Summative Assessment: What Is the Difference?

No matter their form, assessments generally have 
two purposes. An assessment used for summative 
purposes tends to inform a decision about the test 
taker’s competence, and there is no opportunity 
for remediation or development after completion. 
An assessment used for formative purposes is also 
a measure of competence, but results are used to 
inform future actions. For example, a formative 
purpose of performance assessment is to inform a 
principal’s professional development plan. A single 
assessment may serve formative and summative 
purposes in different situations. 

The Learning Point Associates scan included only 
publicly available and rigorously tested measures 
that are useful for formative assessment purposes. 


* Learning Point Associates merged with American Institutes for Research in August 2010. 

As the ISLLC Standards suggest, principals must work within a well-formed ethical code
to oversee instructional quality; develop teacher talents; establish a learning culture in 
schools; and work within and beyond the school to secure financial, human, and political 
capital to maintain and advance organizational operations. 

The ISLLC Standards have been integrated into many states’ licensure procedures 
through the following means: 

■ Alignment of ISLLC Standards with state principal professional standards 

■ Requirement of all principal candidates to receive a certain score on a standardized 
examination, which has been validated against ISLLC Standards, as a prerequisite 
for certification 

■ Requirement of state-recognized preservice principal preparation programs to
demonstrate and defend how program activities prepare candidates to meet
ISLLC Standards and how programs determine whether candidates meet them

Less is known about the integration and alignment of ISLLC Standards, other standards 
lists, or other promising leadership practices with principal performance assessments. 


Reliability and Validity 

For a measure to be included in the scan, documentation of its validity and
reliability testing had to be published. Such testing provides evidence of
psychometric rigor, which should be considered by purchasers and users of
performance assessments.

Assessments are considered valid when they measure what they are intended to 
measure. There are many types of validity, but two of the more salient types in 
constructing performance measures are content and construct validity. Content validity 
is established by ensuring that the test items under consideration measure all of the 
dimensions or facets of a given construct, such as principal performance. Content 
validity can be established by linking the test or other items to a set of standards, 
such as the ISLLC Standards, or practices, such as leadership effectiveness. 

Construct validity is determined by the degree to which test items measure a “construct,” 
which is the element that the items purport to assess. For example, a construct may be 
ISLLC Standard 5, “An education leader promotes the success of every student by acting 
with integrity, fairness, and in an ethical manner” (Council of Chief State School Officers, 
2008, p. 15). For this construct, multiple test items or another method for collecting 
evidence would be needed to determine the degree to which the standard is met. In this 
case, testing for construct validity would determine how well items and observations 
measure principals’ abilities to act with integrity, fairness, and in an ethical manner. 

Reliability is a measure of consistency and stability. A measure has reliability when the
responses are consistent and stable for each individual who takes the test. In other
words, a principal should receive approximately the same score on multiple
administrations of a given test if all other factors remain the same.
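
The idea of stable scores across administrations can be illustrated with a test-retest check. The sketch below is only an illustration: the brief does not say which reliability coefficient each instrument's developers reported, so the use of a Pearson correlation here is an assumption, and the scores and the function name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length score lists with at least two entries")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five principals on two administrations of one test
first_administration = [72, 85, 90, 64, 78]
second_administration = [70, 88, 91, 66, 75]

test_retest = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {test_retest:.2f}")
```

A coefficient near 1.0 suggests that principals keep roughly the same relative standing from one administration to the next; a low coefficient suggests that scores are unstable.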

The Reviewed Measures

Of the 20 school principal performance assessment measures identified through 
Google Scholar, eight met preestablished criteria for inclusion in the review (see 
“How Assessments Were Selected for Review”). 

Some measures, such as the ETS School Leadership Series examinations, provided
extensive documentation of reliability and validity testing but no information about
the formative use of results in performance assessment, so these measures were not
included in the review. Other measures, such as the Chicago Public Schools’ principal
performance rubric, are clearly intended for use during performance assessments, 
but no documentation was available about the validity or reliability of these measures. 

The following principal performance assessments were included in the review and may
be useful resources for superintendents, human
resource directors, and others who are charged 
with gauging principal skills and abilities for hiring, 
performance assessment, and tenure decisions. 
Table 1 provides additional information about each 
of the measures included in this review (see p. 7). 

Change Facilitator Style Questionnaire 

Vandenberghe (1988) developed the Change 
Facilitator Style Questionnaire (CFSQ) to measure 
the extent to which leaders can facilitate change 
(see School Administrators of Iowa, 2003). In 
CFSQ, three different approaches have been 
identified as change facilitator styles: initiator, 
manager, and responder. Data are categorized 
into three clusters with two scales/dimensions 
embedded within each cluster: 

■ Cluster 1. Concern for People: Scale 1
(Social/Informal) and Scale 2 (Formal/Meaningful)

■ Cluster 2. Organizational Efficiency: Scale 3
(Trust in Others) and Scale 4 (Administrative
Efficiency)

■ Cluster 3. Strategic Sense: Scale 5 (Day-to-Day)
and Scale 6 (Vision and Planning)



How Assessments Were Selected for Review 

Learning Point Associates staff conducted a 
keyword search of Google Scholar to locate 
school principal performance assessment 
instruments. More than 5,000 articles were 
initially identified, but the majority of articles 
were not pertinent. To winnow the list further, an
assessment was retained only if its publicly
available support documents reported that the
assessment was

■ Intended for use as a performance
assessment.

■ Psychometrically tested for reliability
and validity.

■ Publicly available for purchase.

For the purposes of the review, psychometrically
sound means that the instrument was tested
for validity and reliability using accepted testing
methods, achieved a minimum reliability rating of
0.75, and underwent content validity and/or
construct validity testing.

Using these criteria, 20 assessments were 
identified, and eight principal performance 
assessment instruments were included in the 
final review. 
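
The selection criteria in this sidebar can be expressed as a simple screening rule. The sketch below is a hypothetical illustration, not the procedure Learning Point Associates staff actually ran: the `Instrument` record, its field names, and the four example instruments are invented for the example, and only the 0.75 threshold and the three criteria come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MIN_RELIABILITY = 0.75  # minimum reliability rating stated in the sidebar


@dataclass
class Instrument:
    """Hypothetical record of what an instrument's support documents report."""
    name: str
    for_performance_assessment: bool
    reliability: Optional[float]  # None when no reliability rating is published
    content_validity_tested: bool
    construct_validity_tested: bool
    publicly_available: bool


def meets_criteria(inst: Instrument) -> bool:
    """Apply the three inclusion criteria described in the sidebar."""
    psychometrically_sound = (
        inst.reliability is not None
        and inst.reliability >= MIN_RELIABILITY
        and (inst.content_validity_tested or inst.construct_validity_tested)
    )
    return (
        inst.for_performance_assessment
        and psychometrically_sound
        and inst.publicly_available
    )


# Invented examples, one passing and three failing for different reasons
candidates = [
    Instrument("Instrument A", True, 0.88, True, False, True),
    Instrument("Instrument B", True, None, False, False, True),  # no published testing
    Instrument("Instrument C", False, 0.93, True, True, True),   # not a performance assessment
    Instrument("Instrument D", True, 0.70, True, True, True),    # reliability below 0.75
]

selected = [inst.name for inst in candidates if meets_criteria(inst)]
print(selected)
```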

Diagnostic Assessment of School and Principal Effectiveness

Ebmeier (1992) developed this measure to identify the strengths of schools and their 
leaders so that school improvement plans and principal professional development goals 
would be better informed. The assessment consists of separate surveys completed
by students, teachers, parents, principals, and principal supervisors. The measures
indicate how these groups view themselves, school leadership, and school performance. 
Multiple measures are completed by multiple groups to identify matches between school 
leader traits and school characteristics. These measures can be used separately 
depending on their purpose. For more information, see Ebmeier (1991). 

Instructional Activity Questionnaire 

This measure was developed by Larsen (1987) as a performance assessment tool that 
specifically addresses instructional leadership aspects of principals’ work (as cited in 
Heck, Larsen, & Marcoulides, 1990). The measure was developed through an extensive 
review of the school principal effectiveness literature. 

Leadership Practices Inventory 

Kouzes and Posner (2002) developed the Leadership Practices Inventory (LPI) by 
extensively interviewing and surveying leaders, including principals, to identify best 
leadership practices. Thus, LPI views leadership practices as transferrable across 
professional types. What works to inspire people in business settings also may work in 
educational settings. LPI’s domains are as follows: (1) modeling the way, (2) inspiring a 
shared vision, (3) challenging the process, (4) enabling others to act, and (5) encouraging 
the heart. This measure has found widespread appeal across many disciplines, and LPI 
can be completed as an online or print survey. For more information, see Kouzes and 
Posner (n.d.). 

Performance Review Analysis and Improvement System for Education 

The Performance Review Analysis and Improvement System for Education (PRAISE) 
assessment system was developed through an extensive review of school administrator 
effectiveness literature. As such, PRAISE domains are not specifically aligned with 
professional standards. The PRAISE domains are problem solving, relations with 
teachers, and professional qualities and competencies. PRAISE is a print assessment 
to be completed by the principal and his or her supervisor. For more information, see 
Knoop and Common (1985). 

Principal Instructional Management Rating Scale

Hallinger and Murphy (1985) developed the Principal Instructional Management Rating 
Scale (PIMRS) to determine the degree to which principals serve as instructional 
managers. PIMRS also provides exemplars of each construct, which may be used by 
raters to identify changes in their own or others’ practices. PIMRS focuses on several 
constructs, including the dedicated use of time for improving instruction, coordinating 
curriculum, and evaluating instruction. For more information, see Leadingware (2008). 

Principal Profile 

The Principal Profile was developed through extensive interviews and consultation with
principals, teachers, superintendents, and department heads. The authors consulted
with practitioners not only to establish validity and reliability but also to ensure
that the measure was practical for use in school/district settings. Two key assumptions
inform the tool:
(1) student growth should be a benchmark for school leader effectiveness and a factor 
in performance evaluation and (2) school leader effectiveness is marked by consistency 
of actions, in that principals need a well-defined set of purposes and the skill and 
knowledge to achieve them on a consistent basis. For more information, see Leithwood 
and Montgomery (1986) and Leithwood (1987). 

Vanderbilt Assessment of Leadership in Education 

Since the Vanderbilt Assessment of Leadership in Education (VAL-ED) was developed
in 2006, it has become one of the most widely used and respected measures of school
leadership performance. Like the Diagnostic Assessment of School and
Principal Effectiveness, VAL-ED assesses principal performance by gathering information
from principals, teachers, and principal supervisors. The results from VAL-ED produce a
quantitative diagnostic profile that is linked to the ISLLC Standards. VAL-ED is based on
a thorough examination of the research literature, including a conceptual framework within
which to place the scale. For more information, see Vanderbilt Peabody College (n.d.)
and Porter, Murphy, Goldring, and Elliot (2006). 


Summary of Findings 

Table 1 synthesizes findings from the review of instruments. In the table, the content 
focus of the assessment (e.g., principal as change facilitator or principal as instructional 
leader) and evaluation approach (e.g., self-reflection survey or 360-degree evaluation) 
are indicated in the column labeled “Approach.” Validity measures and testing methods 
are generally described. In the “Reliability” column, a benchmark of 0.80 was used to 
indicate “moderate” reliability, and a benchmark of 0.90 was used to indicate “high” 
reliability. Any reliability rating below 0.80 was considered “poor.” 
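
The labeling rule for the “Reliability” column can be written out as a short function. This is a minimal sketch: the function name is hypothetical, and treating a coefficient exactly at a benchmark as meeting it is an assumption, because the text does not say how boundary values were handled.

```python
def reliability_label(coefficient: float) -> str:
    """Map a reliability coefficient to the label used in the Reliability column.

    Assumption: a value exactly equal to a benchmark counts as meeting it.
    """
    if coefficient >= 0.90:
        return "high"
    if coefficient >= 0.80:
        return "moderate"
    return "poor"


# Illustrative coefficients, not values taken from Table 1
for value in (0.93, 0.84, 0.76):
    print(value, reliability_label(value))
```

Note that an instrument could pass the 0.75 inclusion threshold described in “How Assessments Were Selected for Review” and still be labeled “poor” under these benchmarks.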

Table 1. School Leadership Measures

Instrument Author(s) Approach Time Required Content and Construct Validity Reliability

Findings

The Internet-based scan of scholarly articles and books identified 20 school
principal performance assessments, which were intended for use in hiring, advancement, 
and tenure decisions. Of the 20 assessments, eight met criteria for rigor, which meant 
that the assessment development process was transparent and involved some 
psychometric testing, and measures were provided for review. Two of the eight 
assessments were developed in the past decade, and the remainder were developed 
10-20 years ago. 

The scan suggests that, although there is considerable interest in school principal quality 
and accountability, few principal performance assessments have been rigorously developed 
or make details of psychometric testing available for public review. One possible
explanation for this finding is that few assessments are being used in the field, but
the findings of Goldring et
al. (2009) suggest that many principal performance assessments of varying quality are 
being used. Unpublished assessments were not included in the scan. 

In addition, the age of instruments raises questions about their continued validity 
for assessing principal performance. Given the emphasis on instructional leadership, 
accountability, data-based decision making, community involvement, and other
well-documented changes to the school principal position in the past 10 years, it is plausible
that older measures do not capture essential features of the position. Changes in the 
position and additional research on principal effectiveness raise concerns and may be 
cause for revalidation of older assessments. 

The scan also highlights the different approaches to assessing school principal 
performance. The eight principal performance assessments measure the degree to which
principals fulfill different roles. For example, CFSQ addresses principals’ roles as
change facilitators, VAL-ED focuses on principals as instructional leaders, and PRAISE 
examines principal capacity to improve school-level systems. Each provides test takers 
and principal evaluators with slightly different perspectives on principal practices. 

In addition, the assessments take different approaches to data collection. Several 
measures use self-assessment questionnaires or rubrics that provide an aggregate 
score and help principals to answer the following question: “How do I think I am 
doing, in reference to professional competencies?” Others use more intensive 
360-degree surveys from multiple constituents to create an aggregate profile, which 
can provide comparative information based on multiple perspectives to principals 
about their performance. The use of different constituencies to rate principal 
performance is a growing trend (Lashway, 2003). These evaluations answer the 
following question: “How do I, and others, believe I am doing, in reference to 
professional competencies?”
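
The difference between a single aggregate score and a multi-rater profile can be made concrete with a small example. The sketch below assumes a simple mean rating per rater group within each domain; the reviewed instruments may aggregate differently, and the domain names, rater groups, 1 to 5 rating scale, and function name are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def build_profile(responses):
    """Average ratings by domain and rater group.

    responses: iterable of (rater_group, domain, rating) tuples.
    Returns {domain: {rater_group: mean_rating}}.
    """
    buckets = defaultdict(list)
    for rater_group, domain, rating in responses:
        buckets[(domain, rater_group)].append(rating)

    profile = defaultdict(dict)
    for (domain, rater_group), ratings in buckets.items():
        profile[domain][rater_group] = mean(ratings)
    return dict(profile)


# Hypothetical ratings on a 1-5 scale from three rater groups
responses = [
    ("principal", "vision", 5),
    ("teacher", "vision", 3),
    ("teacher", "vision", 4),
    ("supervisor", "vision", 4),
    ("principal", "instruction", 4),
    ("teacher", "instruction", 2),
    ("teacher", "instruction", 3),
    ("supervisor", "instruction", 3),
]

profile = build_profile(responses)
for domain, by_group in profile.items():
    print(domain, by_group)
```

Placing the principal's self-rating beside the teacher and supervisor means for each domain is what allows the comparison across perspectives; a self-assessment alone would yield only the first value in each row.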

In conjunction with student achievement data, the performance assessments
that are included in this review hold potential for raising principal accountability and 
identifying necessary changes in practice. However, principal performance assessment 
data will achieve desired ends only if principals and their supervisors view the data as 
credible and actionable and give assessment data considerable weight during principal 
performance evaluations. Close examinations of the principal performance evaluation
process, including its frequency and structure, would provide information about how
assessments are used. In addition, such examinations would offer insight for assessment
developers about how to structure assessment processes for better effects.